Annotating Qualia Relations in Italian and French Complex Nominals
Abstract
The goal of this paper is to provide an annotation scheme for compounds based on Generative Lexicon theory (GL; Pustejovsky, 1995; Bassac and Bouillon, 2001). The scheme has been tested on a set of compounds automatically extracted from the Europarl corpus (Koehn, 2005) in both Italian and French. The motivation is twofold. On the one hand, the scheme should help refine existing compound classifications and better explain lexicalization in both languages. On the other hand, we hope that the extracted generalizations can be used in NLP, for example to improve MT systems or for query reformulation (Claveau, 2003). In this paper, we focus on the annotation scheme and its ongoing evaluation.
Similar resources
Representing Italian Complex Nominals: A Pilot Study
A corpus-based investigation of Italian Complex Nominals (CNs), of the form N+PP, which aims at clarifying their syntactic and semantic constitution, is presented. The main goal is to find out useful parameters for their representation in a computational lexicon. As a reference model we have taken an implementation of Pustejovsky’s Generative Lexicon Theory (1995), the SIMPLE Italian Lexicon, a...
CLIPS, a Multi-level Italian Computational Lexicon: a Glimpse to Data
CLIPS is a multi-layered Italian computational lexicon based on the PAROLE-SIMPLE model. In this paper we briefly recall the main characteristics of the model and devote our attention to issues emerging from the encoding of large quantities of data, especially in relation to those types of syntactic and semantic information specific to our lexicon and that reflect innovative features of the und...
Automatic identification of semantic relations in Italian complex nominals
This paper addresses the problem of the identification of the semantic relations in Italian complex nominals (CNs) of the type N+P+N. We exploit the fact that the semantic relation, which is underspecified in most cases, is partially made explicit by the preposition. We develop an annotation framework around five different semantic relations, which we use to create a corpus of 1700 Italian CNs,...
Studying the role of Qualia Relations for Word Sense Disambiguation
This paper studies the importance of qualia relations for Word Sense Disambiguation (WSD). We use a graph-based WSD algorithm over the Italian WordNet and evaluate it when adding different kinds of qualia relations (agentive, constitutive, formal and telic) taken from PAROLE-SIMPLE-CLIPS (PSC), a Language Resource based on the Generative Lexicon theory. Some qualia relations, especially telic, a...
From Glosses to Qualia: Qualia Extraction from Senso Comune
This paper describes a case study on methods for automatically extracting qualia relations from dictionary glosses in Italian, namely the Senso Comune De Mauro Dictionary (SCDM). The qualia extraction has been addressed by means of a pattern-based approach and lexical match with an Italian generative lexicon based language resource, PAROLE-SIMPLE-CLIPS (PSC). The evaluation of the extraction app...